Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix wallet sync not finding coins of addresses which are not cached #672

Conversation

evanlinjin
Copy link
Member

@evanlinjin evanlinjin commented Jul 17, 2022

Fixes #521
Fixes #451

^ However, only for electrum-based Blockchain implementations. For RPC and Compact Block Filters, syncing works differently, and so are the bugs - I have created a separate ticket for this (#677).

Description

Previously, electrum-based blockchain implementations only synced for scriptPubKeys that are already cached in Database.

This PR introduces a feedback mechanism, that uses stop_gap and the difference between "current index" and "last active index" to determine whether we need to cache more scriptPubKeys.

The WalletSync::wallet_setup trait now may return an Error::MissingCachedScripts error which contains the number of extra scriptPubKeys to cache, in order to satisfy stop_gap for the next call.

Wallet::sync now calls WalletSync in a loop, caching in-between subsequent calls (if needed).

Notes to reviewers

  1. The caveat to this solution is that it is not the most efficient. Every call to WalletSync::wallet_setup starts polling the Electrum-based server for scriptPubKeys starting from index 0.

    However, I feel like this solution is the least "destructive" to the API of Blockchain. Also, once the bdk_core sync logic is integration, we can select specific ranges of scriptPubKeys to sync.

  2. Also note that this PR only fixes the issue for electrum-based Blockchain implementations (currently blockchain::electrum and blockchain::esplora only).

  3. Another thing to note is that, although Database assumes 1-2 keychains, the current WalletSync "feedback" only returns one number (which is interpreted as the larger "missing count" of the two keychains). This is done for simplicity, and because we are planning to only have one keychain per database in the future.

    bdk/src/blockchain/mod.rs

    Lines 157 to 161 in f0c876e

    fn wallet_sync<D: BatchDatabase>(
    &self,
    database: &mut D,
    progress_update: Box<dyn Progress>,
    ) -> Result<usize, Error> {

  4. Please have a read of Fix wallet sync not finding coins of addresses which are not cached #672 (comment) for additional context.

Checklists

All Submissions:

  • I've signed all my commits
  • I followed the contribution guidelines
  • I ran cargo fmt and cargo clippy before committing

Bugfixes:

  • This pull request breaks the existing API
  • I've added tests to reproduce the issue which are now passing
  • I'm linking the issue being fixed by this PR

@evanlinjin evanlinjin force-pushed the fix-blockchain-sync-feedback-missing-cache branch 2 times, most recently from c699aea to f0c876e Compare July 17, 2022 13:40
@evanlinjin
Copy link
Member Author

evanlinjin commented Jul 18, 2022

After talking with @LLFourn in a private conversation, it seems that the solution provided in this PR does not fully solve #521.

The problem that still exists

The problem that still exists is as follows. Given:

  • A wallet established with a keychain, which contains a transaction in which at least one input spends from an owned scriptPubKey, and at least one output sends to a different owned scriptPubKey.
  • The Database only contains the cache of one of these scriptPubKeys.

If the wallet performs a sync, and attempts to compile data (such as owned outputs, UTXOs and balances) from the transaction, it will miss including data from an owned scriptPubKey.

The fix?

Currently, we define stop_gap as the largest expected (child index) gap between two "used" scriptPubKeys (where "used" means, contained within atleast one transaction output).

I propose adding an additional constraint to the definition of stop_gap. Given a transaction, any two scriptPubKeys referenced by that transaction (either being spent, or received to) should not have a difference in index greater than stop_gap.

If we can enforce the new definition of stop_gap in code, users of BDK can have a more predictable sync/update experience, hence fixing #521.

How does everything work currently?

Analyzing the syncing logic (of electrum-based Blockchain implementations), we see that syncing is defined in 4 steps (as per defined in blockchain::Request.

/// A request for on-chain information
pub enum Request<'a, D: BatchDatabase> {
/// A request for transactions related to script pubkeys.
Script(ScriptReq<'a, D>),
/// A request for confirmation times for some transactions.
Conftime(ConftimeReq<'a, D>),
/// A request for full transaction details of some transactions.
Tx(TxReq<'a, D>),
/// Requests are finished here's a batch database update to reflect data gathered.
Finish(D::Batch),
}

The first two steps are as follows:

  1. Request::Script: First we send the electrum-based server a list of scriptPubKeys, obtaining a bunch of txids (and their confirmation times).
  2. Request::TxReq: We obtain the full raw transactions of txids (if not already exist in Database) and compile metadata of the transaction (such as sent/received amounts).

Then the final step is where a batch update is prepared for the Database, in which one of these steps is to "set every UTXO we observed" (in the above steps).

As you can see, compiling sent/received balances from transactions, as well as finding owned UTXOs of new transactions require checking against the cache of owned scriptPubKeys. If we "under-cache" before these steps, we will "miss out" on finding crucial data.

We cannot fully fix this problem (unless we check with every possible scriptPubKey - but that is ridiculously expensive in computation), so we use our new definition of stop_gap to ensure sync behaves as expected for the user.

How do we enforce the new stop_gap definition?

So we need to ensure the following are fulfilled before compiling metadata (such as new outputs, utxos and balances):

  1. We have obtained all transactions of owned scriptPubKeys (where scriptPubKeys are searched for in chronological order starting from child index 0), in which the last stop_gap+1 scriptPubKeys are not associated with any transactions.
  2. We have cached a stop_gap number of scriptPubKeys above the last scriptPubKey index we queried for during Request::Script.

What changes do we need to make?

This should now all be implemented in this commit: 12afbe3

In conclusion

Wallet sync is a difficult problem since we are either sacrificing performance for reliability, or sacrificing reliability for performance. The best thing we can do, is to make the syncing logic as predictable as we can for the developer/user.

Additionally, let's design a good API for sync logic.

@evanlinjin evanlinjin force-pushed the fix-blockchain-sync-feedback-missing-cache branch 3 times, most recently from 3625495 to 12afbe3 Compare July 19, 2022 11:25
@evanlinjin
Copy link
Member Author

I just had a thought. I feel like to improve the efficiently, the WalletSync implementation can internally self-cache txids before returning Error::MissingCachedScripts. This way, subsequent calls can use the cache to avoid re-polling the electrum-based server (for the scriptPubKeys that are already requested for).

Copy link
Member

@afilini afilini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK, a few comments.

src/blockchain/script_sync.rs Outdated Show resolved Hide resolved
src/wallet/mod.rs Outdated Show resolved Hide resolved
Comment on lines 1697 to 1702
// first run, we record progress...
let mut sync_res = if run_setup {
maybe_await!(blockchain.wallet_setup(self.database.borrow_mut().deref_mut(), progress))
} else {
maybe_await!(blockchain.wallet_sync(self.database.borrow_mut().deref_mut(), progress,))?;
maybe_await!(blockchain.wallet_sync(self.database.borrow_mut().deref_mut(), progress))
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can avoid duplicating this piece of code by moving this inside the loop, and then breaking if the descriptor is not derivable.

Something like:

for _ in 0..MAX_ROUNDS {
    let sync_res = // .... ;

    if !is_derivable {
        break;
    }

    // do everything else to get the database ready for the next iteration
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what to do with the progress, I feel like we should have our own "internal" progress (that can be cloned and reused) that wraps the real progress and maps the first iteration to a maximum of like 70%, then if there are more iterations each has a lower "weight", like the second iteration going to 0 to 100 only causes the overall progress to go from 70 to 80 maybe and then at the end we just set it to 100.

It's not great but I can't think of any other way to predict how many iterations there's going to be. We should adjust those weights based on probability, most of the times it's just one (so that gets a larger weight).

If you have better ideas let me know.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did something else to avoid duplicating code, let me if you are happy with my solution :)

Regarding the SyncProgress object... Because right now electrum-based blockchains don't actually report progress at all, and only electrum-based blockchains will return Error::MissingCachedScripts (for now), I feel like we shouldn't bother changing behavior for it in this PR?

Maybe we should do another ticket to fix progress-reporting behavior. I realized that not all sync attempts will report progress (for example, rpc only reports sync progress on it's initial sync).

@afilini afilini added this to the Release 0.21.0 Feature Freeze milestone Jul 20, 2022
@evanlinjin evanlinjin force-pushed the fix-blockchain-sync-feedback-missing-cache branch from 12afbe3 to 8a75d01 Compare July 20, 2022 14:15
Previously, electrum-based blockchain implementations only synced for
`scriptPubKey`s that are already cached in `Database`.

This PR introduces a feedback mechanism, that uses `stop_gap` and the
difference between "current index" and "last active index" to determine
whether we need to cache more `scriptPubKeys`.

The `WalletSync::wallet_setup` trait now may return an
`Error::MissingCachedScripts` error which contains the number of extra
`scriptPubKey`s to cache, in order to satisfy `stop_gap` for the next call.

`Wallet::sync` now calls `WalletSync` in a loop, cacheing inbetween
subsequent calls (if needed).
@evanlinjin evanlinjin force-pushed the fix-blockchain-sync-feedback-missing-cache branch from 8a75d01 to 5c940c3 Compare July 20, 2022 15:08
@evanlinjin evanlinjin requested a review from afilini July 20, 2022 15:21
@evanlinjin
Copy link
Member Author

@afilini @LLFourn would it make sense to try to fix the same bugs in the other Blockchain implementations in this PR?

@afilini
Copy link
Member

afilini commented Jul 21, 2022

I think it would be nice to have everything in a single PR, if you can fix those as well we can merge everything together.

@evanlinjin
Copy link
Member Author

I think it would be nice to have everything in a single PR, if you can fix those as well we can merge everything together.

Fair, I've created a separate ticket (#677) since it seems the "nature" of the bugs are a little different, but I'll try tackle them in this PR as well.

@evanlinjin
Copy link
Member Author

@afilini I started working on fixing sync for RPC... But because sync works very different, so the nature of the bug is also really different. I reckon it should be tackled in a separate PR because it feels like a completely different set of problems.

I've created a separate ticket here (#677), would it be okay to work on this in a different PR? What do you think?

@afilini
Copy link
Member

afilini commented Jul 21, 2022

Yes, that makes sense, I'll go ahead and merge this one then

Copy link
Member

@afilini afilini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 5c940c3

@afilini afilini merged commit 3644a45 into bitcoindevkit:master Jul 21, 2022
notmandatory pushed a commit to bitcoindevkit/rust-esplora-client that referenced this pull request Aug 2, 2022
…resses which are not cached

7e2ad9b Fix wallet sync not finding coins of addresses which are not cached (志宇)

Pull request description:

  Fixes #521
  Fixes #451

  ^ However, only for electrum-based `Blockchain` implementations. For RPC and Compact Block Filters, syncing works differently, and so are the bugs - I have created a separate ticket for this (#677).

  ### Description

  Previously, electrum-based blockchain implementations only synced for `scriptPubKey`s that are already cached in `Database`.

  This PR introduces a feedback mechanism, that uses `stop_gap` and the difference between "current index" and "last active index" to determine whether we need to cache more `scriptPubKeys`.

  The `WalletSync::wallet_setup` trait now may return an `Error::MissingCachedScripts` error which contains the number of extra `scriptPubKey`s to cache, in order to satisfy `stop_gap` for the next call.

  `Wallet::sync` now calls `WalletSync` in a loop, caching in-between subsequent calls (if needed).

  #### Notes to reviewers

  1. The caveat to this solution is that it is not the most efficient. Every call to `WalletSync::wallet_setup` starts polling the Electrum-based server for `scriptPubKey`s starting from index 0.

      However, I feel like this solution is the least "destructive" to the API of `Blockchain`. Also, once the `bdk_core` sync logic is integration, we can select specific ranges of `scriptPubKey`s to sync.

  2. Also note that this PR only fixes the issue for electrum-based `Blockchain` implementations (currently `blockchain::electrum` and `blockchain::esplora` only).

  3. Another thing to note is that, although `Database` assumes 1-2 keychains, the current `WalletSync` "feedback" only returns one number (which is interpreted as the larger "missing count" of the two keychains). This is done for simplicity, and because we are planning to only have one keychain per database in the future.

      https://github.com/bitcoindevkit/bdk/blob/f0c876e7bf38566c0d224cbe421ee312ffc06660/src/blockchain/mod.rs#L157-L161

  4. Please have a read of bitcoindevkit/bdk#672 (comment) for additional context.

  ### Checklists

  #### All Submissions:

  * [x] I've signed all my commits
  * [x] I followed the [contribution guidelines](https://github.com/bitcoindevkit/bdk/blob/master/CONTRIBUTING.md)
  * [x] I ran `cargo fmt` and `cargo clippy` before committing

  #### Bugfixes:

  * [x] This pull request breaks the existing API
  * [x] I've added tests to reproduce the issue which are now passing
  * [x] I'm linking the issue being fixed by this PR

ACKs for top commit:
  afilini:
    ACK 7e2ad9b

Tree-SHA512: aee917ed4821438fc0675241432a7994603a09a77d5a72e96bad863e7cdd55a9bc6fbd931ce096fef1153905cf1b786e1d8d932dc19032d549480bcda7c75d1b
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
2 participants